90 research outputs found

Integrated multiple sequence alignment

Author: Sammeth Michael
Publication venue: Bielefeld University
Publication date: 01/01/2005

Sammeth M. Integrated multiple sequence alignment. Bielefeld (Germany): Bielefeld University; 2005. The thesis presents enhancements for automated and manual multiple sequence alignment: existing alignment algorithms are made more easily accessible and new algorithms are designed for difficult cases. Firstly, we introduce the QAlign framework, a graphical user interface for multiple sequence alignment. It comprises several state-of-the-art algorithms and exposes their parameters through convenient dialogs. An alignment viewer with guided editing functionality can also highlight or print regions of the alignment. Phylogenetic features are also provided, e.g., distance-based tree reconstruction methods, corrections for multiple substitutions and a tree viewer. The modular concept and the platform-independent implementation guarantee easy extensibility. Further, we develop a constrained version of the divide-and-conquer alignment such that it can be restricted by anchors found earlier with local alignments. It can be shown that this method shares attributes of both local and global aligners, in the quality of results as well as in the computation time. We further modify the local alignment step to work on bipartite (or even multipartite) sets for sequences where repeats overshadow valuable sequence information. In the end a technique is established that can accurately align sequences containing possibly repeated motifs. Finally, another algorithm is presented that allows tandem repeat sequences to be compared by aligning them with respect to their possible repeat histories. We describe an evolutionary model including tandem duplications and excisions, and give an exact algorithm to compare two sequences under this model.

Publications at Bielefeld University

ASTALAVISTA: dynamic and flexible analysis of alternative splicing events in custom gene datasets

Author: Foissac Sylvain
Sammeth Michael
Publication venue: Oxford University Press
Publication date: 01/01/2007

In the process of establishing more and more complete annotations of eukaryotic genomes, a constantly growing number of alternative splicing (AS) events has been reported over the last decade. Consequently, the increasing transcript coverage also revealed the real complexity of some variations in the exon–intron structure between transcript variants and the need for computational tools to address ‘complex’ AS events. ASTALAVISTA (alternative splicing transcriptional landscape visualization tool) employs an intuitive and complete notation system to univocally identify such events. The method extracts AS events dynamically from custom gene annotations, classifies them into groups of common types and visualizes a comprehensive picture of the resulting AS landscape. Thus, ASTALAVISTA can characterize AS for whole transcriptome data from reference annotations (GENCODE, REFSEQ, ENSEMBL) as well as for genes selected by the user according to common functional/structural attributes of interest: http://genome.imim.es/astalavist

Crossref

PubMed Central

ProdInra

OpenDSU: Digital Sovereignty in PharmaLedger

Author: Alboaie Sînică
Sammeth Michael
Ursache Cosmin
Publication date: 29/09/2022

Distributed ledger networks, chiefly those based on blockchain technologies, currently are heralding a next generation of computer systems that aims to suit modern users' demands. Over the recent years, several technologies for blockchains, off-chaining strategies, as well as decentralised and respectively self-sovereign identity systems have shot up so fast that standardisation of the protocols is lagging behind, severely hampering the interoperability of different approaches. Moreover, most of the currently available solutions for distributed ledgers focus on either home users or enterprise use case scenarios, failing to provide integrative solutions addressing the needs of both. Herein we introduce the OpenDSU platform that allows generic blockchain technologies to interoperate, organised (and possibly cascaded in a hierarchical fashion) in domains. To achieve this flexibility, we seamlessly integrated a set of well-conceived OpenDSU components to orchestrate off-chain data with granularly resolved and cryptographically secure access levels that are nested with sovereign identities across the different domains. Applying our platform to PharmaLedger, an inter-European network for the standardisation of data handling in the pharmaceutical industry and in healthcare, we demonstrate that OpenDSU can cope with generic demands of heterogeneous use cases in both performance and the handling of substantially different business policies. Importantly, whereas available solutions commonly require a pre-defined and fixed set of components, no such vendor lock-in restrictions on the blockchain technology or identity system exist in OpenDSU, making systems built on it flexibly adaptable to new standards evolving in the future. Comment: 18 pages, 8 figures

arXiv.org e-Print Archive

Evaluating Characteristics of De Novo Assembly Software on 454 Transcriptome Data: A Simulation Approach

Author: Bornberg-Bauer Erich
Feulner Philine G. D.
Mundry Marvin
Sammeth Michael
Publication venue: Public Library of Science
Publication date: 27/02/2012

Background: The quantity of transcriptome data is rapidly increasing for non-model organisms. As sequencing technology advances, focus shifts towards solving bioinformatic challenges, of which sequence read assembly is the first task. Recent studies have compared the performance of different software to establish a best practice for transcriptome assembly. Here, we adapted a simulation approach to evaluate specific features of assembly programs on 454 data. The novelty of our study is that the simulation allows us to calculate a model assembly as reference point for comparison. Findings: The simulation approach allows us to compare basic metrics of assemblies computed by different software applications (CAP3, MIRA, Newbler, and Oases) to a known optimal solution. We found MIRA and CAP3 are conservative in merging reads. This resulted in a comparably high number of short contigs. In contrast, Newbler more readily merged reads into longer contigs, while Oases produced the overall shortest assembly. Due to the simulation approach, reads could be traced back to their correct placement within the transcriptome. Together with mapping reads onto the assembled contigs, we were able to evaluate ambiguity in the assemblies. This analysis further supported the conservative nature of MIRA and CAP3, which resulted in low proportions of chimeric contigs, but high redundancy. Newbler produced less redundancy, but the proportion of chimeric contigs was higher. Conclusion: Our evaluation of four assemblers suggested that MIRA and Newbler slightly outperformed the othe

CiteSeerX

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Münstersches Informations und Archivsystem für Multimediale Inhalte

Differences in performance between Oticon MultiFocus Compact and ReSound BT2-E hearing aids

Author: Coughlin Maureen
Potts Lisa G
Sammeth Carol A
Valente Michael
Wagner-Escobar Michelle
Wynne Michael K
Publication venue: Digital Commons@Becker
Publication date: 01/01/1997

Digital Commons@Becker

Based Upon Repeat Pattern (BURP): an algorithm to characterize the long-term evolution of Staphylococcus aureus populations based on spa polymorphisms

Author: Berssenbrügge Christoph
Harmsen Dag
Mellmann Alexander
Rothgänger Jörg
Sammeth Michael
Stoye Jens
Weniger Thomas
Publication venue: BioMed Central
Publication date: 01/01/2007

Mellmann A, Weniger T, Berssenbrügge C, et al. Based Upon Repeat Pattern (BURP): an algorithm to characterize the long-term evolution of Staphylococcus aureus populations based on spa polymorphisms. BMC Microbiology. 2007;7(1):98. Background: For typing of Staphylococcus aureus, DNA sequencing of the repeat region of the protein A (spa) gene is a well established discriminatory method for outbreak investigations. Recently, it was hypothesized that this region also reflects long-term epidemiology. However, no automated and objective algorithm existed to cluster different repeat regions. In this study, the Based Upon Repeat Pattern (BURP) implementation, a heuristic variant of the newly described EDSI algorithm, was investigated to infer the clonal relatedness of different spa types. For calibration of BURP parameters, 400 representative S. aureus strains with different spa types were characterized by MLST and clustered using eBURST as "gold standard" for their phylogeny. Typing concordance analyses between eBURST and BURP clustering (spa-CC) were performed using all possible BURP parameters to determine their optimal combination. BURP was subsequently evaluated with a strain collection reflecting the breadth of diversity of S. aureus (JCM 2002; 40: 4544). Results: In total, the 400 strains exhibited 122 different MLST types. eBURST grouped them into 23 clonal complexes (CC; 354 isolates) and 33 singletons (46 isolates). BURP clustering of spa types using all possible parameter combinations and subsequent comparison with eBURST CCs resulted in concordances ranging from 8.2 to 96.2%. However, 96.2% concordance was reached only if spa types shorter than 8 repeats were excluded, which resulted in 37% excluded spa types. Therefore, the optimal combination of the BURP parameters was "exclude spa types shorter than 5 repeats" and "cluster spa types into spa-CC if cost distances are less than 4" exhibiting 95.3% concordance to eBURST. This algorithm identified 24 spa-CCs, 40 singletons, and excluded only 7.8% spa types. Analyzing the natural population with these parameters, the comparison of whole-genome micro-array groupings (at the level of 0.31 Pearson correlation index) and spa-CCs gave a concordance of 87.1%; BURP spa-CCs vs. manually grouped spa types resulted in 95.7% concordance. Conclusion: BURP is the first automated and objective tool to infer clonal relatedness from spa repeat regions. It is able to extract an evolutionary signal rather congruent to MLST and micro-array data.
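
The two parameters quoted in the abstract above (exclude spa types shorter than 5 repeats; cluster spa types if their cost distance is less than 4) can be illustrated with a minimal Python sketch. This is not the BURP implementation: the EDSI cost model is not described in this listing, so a plain unit-cost edit distance over repeat lists stands in for it, and the grouping is simple single linkage. The function names, the integer repeat codes, and the example spa type names are all hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Unit-cost Levenshtein distance between two repeat lists.

    Assumption: a stand-in for the EDSI cost model, which is not given here.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def cluster_spa_types(spa_types, min_repeats=5, max_cost=4):
    """Group spa types whose pairwise cost is below max_cost (single linkage).

    Returns (clusters, singletons, excluded), each sorted by name.
    """
    # Filter step: drop spa types with fewer than min_repeats repeats.
    kept = {n: r for n, r in spa_types.items() if len(r) >= min_repeats}
    excluded = sorted(set(spa_types) - set(kept))

    # Union-find over the remaining spa types.
    parent = {n: n for n in kept}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair whose cost distance is strictly less than max_cost.
    for a, b in combinations(kept, 2):
        if edit_distance(kept[a], kept[b]) < max_cost:
            parent[find(a)] = find(b)

    groups = {}
    for n in kept:
        groups.setdefault(find(n), []).append(n)
    clusters = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    singletons = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, singletons, excluded


# Hypothetical spa types: name -> list of repeat codes (made-up values).
example = {
    "tA": [1, 2, 3, 4, 5, 6],
    "tB": [1, 2, 3, 4, 5, 7],
    "tC": [9, 9, 9, 9, 9, 9, 9, 9, 9, 9],
    "tD": [1, 2, 3],
}
clusters, singletons, excluded = cluster_spa_types(example)
```

With these made-up inputs, `tA` and `tB` differ by one repeat and fall into one group, `tC` stays a singleton, and `tD` is excluded for being shorter than 5 repeats. Raising `min_repeats` excludes more spa types, which mirrors the trade-off between concordance and exclusion rate discussed in the abstract.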

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Publications at Bielefeld University

Complete Alternative Splicing Events Are Bubbles in Splicing Graphs

Author: Fu X.Y.
Goux-Pelletan M.
Grasso C.
Michael Sammeth
Streuli M.
Sugnet C.W.
Publication venue: Mary Ann Liebert Inc

Crossref

The effects of death and post-mortem cold ischemia on human tissue transcriptomes

Author: Aguet François
Amador Raziel
Amadoz Alicia
Ardlie Kristin G.
Breschi Alessandra, 1988-
Carbonell-Caballero Jose
Curado Joao
Dopazo Joaquín
Ferreira Pedro G.
Godinho Caio P. Sá
Guigó Serra Roderic
Hidalgo Marta R.
Muñoz-Aguirre Manuel
Nurtdinov Ramil
Oliveira Carla
Oliveira Patrícia
Pervouchine Dmitri D.
Reverter Ferran
Sammeth Michael
Sodaei Reza, 1988-
Sousa Abel
Çubut Cankut
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Post-mortem tissue samples are a key resource for investigating patterns of gene expression. However, the processes triggered by death and the post-mortem interval (PMI) can significantly alter physiologically normal RNA levels. We investigate the impact of PMI on gene expression using data from multiple tissues of post-mortem donors obtained from the GTEx project. We find that many genes change expression over relatively short PMIs in a tissue-specific manner, but this potentially confounding effect in a biological analysis can be minimized by taking into account appropriate covariates. By comparing ante- and post-mortem blood samples, we identify the cascade of transcriptional events triggered by death of the organism. These events do not appear to simply reflect stochastic variation resulting from mRNA degradation, but active and ongoing regulation of transcription. Finally, we develop a model to predict the time since death from the analysis of the transcriptome of a few readily accessible tissues.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Directory of Open Access Journals

UPF Digital Repository

Queen Mary Research Online

Repositório Aberto da Universidade do Porto

Fondo Bibliográfico Digital Institucional

Gene expansion shapes genome architecture in the human pathogen Lichtheimia corymbifera: an evolutionary genomics analysis in the ancient terrestrial mucorales (Mucoromycotina)

Author: Axel A Brakhage
Ekaterina Shelest
Fabian Horn
Ilse D Jacobsen
Jörg Linde
Kerstin Kaerger
Kerstin Voigt
Konstantin Riege
Manja Marz
Marina Marcet-Houben
Michael Sammeth
Minou Nowrousian
Sascha Winter
Sebastian Böcker
Stefanie Wehner
Toni Gabaldón
Vito Valiante
Volker U Schwartze
Publication venue: Public Library of Science (PLoS)
Publication date: 14/08/2014

Lichtheimia species are the second most important cause of mucormycosis in Europe. To provide broader insights into the molecular basis of the pathogenicity-associated traits of the basal Mucorales, we report the full genome sequence of L. corymbifera and compare it to the genome of Rhizopus oryzae, the most common cause of mucormycosis worldwide. The genome assembly encompasses 33.6 MB and 12,379 protein-coding genes. This study reveals four major differences of the L. corymbifera genome to R. oryzae: (i) the presence of a highly elevated number of gene duplications which, unlike in R. oryzae, are not due to whole genome duplication (WGD), (ii) despite the relatively high incidence of introns, alternative splicing (AS) is not frequently observed for the generation of paralogs and in response to stress, (iii) the content of repetitive elements is strikingly low (<5%), (iv) L. corymbifera is typically haploid. Novel virulence factors were identified which may be involved in the regulation of the adaptation to iron limitation, e.g. LCor01340.1 encoding a putative siderophore transporter and LCor00410.1 involved in the siderophore metabolism. The genes encoding the transcription factors LCor08192.1 and LCor01236.1 are similar to GATA type regulators and to calcineurin regulated CRZ1, respectively, indicating an involvement of the calcineurin pathway in the adaptation to iron limitation. Genes encoding MADS-box transcription factors are elevated up to 11 copies compared to the 1–4 copies usually found in other fungi. More findings are: (i) lower content of tRNAs, but unique codons in L. corymbifera, (ii) over 25% of the proteins are apparently specific for L. corymbifera, (iii) L. corymbifera contains only 2/3 of the proteases (known to be essential virulence factors) in comparison to R. oryzae. On the other hand, the number of secreted proteases is roughly twice as high as in R. oryzae.

Stirling Online Research Repository (RIOXX)

Directory of Open Access Journals

PubMed Central

Portsmouth University Research Portal (Pure)

Stirling Online Research Repository

FigShare

Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

Author: Barann Matthias
Esteve-Codina Anna
Ezquina Suzana
Ferreira Pedro G.
Friedlander Marc R.
GEUVADIS Consortium
Guigo Roderic
Lappalainen Tuuli
Oti Martin
Palotie A.
Rivas Manuel A.
Rosenstiel Philip
Sammeth Michael
Strom Tim M.
Wieland Thomas
Publication date: 01/01/2016

A. Palotie is a member of the GEUVADIS Consortium working group. Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing (alternative splice sites, introns, and cleavage sites), which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.

PubMed Central

UPF Digital Repository

Digital.CSIC

Diposit Digital de Documents de la UAB

Helsingin yliopiston digitaalinen arkisto

Archive ouverte UNIGE